Generic Text Summarization Using WordNet
Abstract
This paper presents a WordNet-based approach to text summarization. The document to be summarized is used to extract a “relevant” sub-graph from the WordNet graph. Weights are assigned to each node of this sub-graph using a strategy similar to Google's PageRank algorithm. These weights capture the relevance of the respective synsets with respect to the whole document. A matrix is then created in which each row represents a sentence and each column a node of the sub-graph (i.e., a synset). Principal Component Analysis is performed on this matrix to help extract the sentences for the summary. Our approach is generic, unlike most previous approaches, which address specific genres of documents such as news articles and biographies. Testing our system on the standard DUC2002 extracts shows that our results are promising and comparable to existing summarizers.
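The pipeline described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not taken from the paper: the hand-written toy graph and its synset-style labels, the damping factor, the use of the rank weight as the matrix entry, and the rule of scoring sentences by their projection onto the first principal component (the abstract only says PCA "helps" extract sentences). It uses plain Python with power iteration in place of a linear-algebra library.

```python
import math

def pagerank(graph, damping=0.85, iterations=50):
    """PageRank-style weights by power iteration on an undirected graph.
    `graph` maps each node to its neighbour list (no isolated nodes)."""
    nodes = sorted(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        rank = {
            n: (1 - damping) / len(nodes)
               + damping * sum(rank[m] / len(graph[m]) for m in graph[n])
            for n in nodes
        }
    return rank

def first_component(centered, iterations=100):
    """First principal direction of column-centered rows, by power
    iteration on X^T X (computed as X^T (X v) without forming X^T X)."""
    cols = len(centered[0])
    vec = [float(j + 1) for j in range(cols)]  # non-uniform starting vector
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        proj = [sum(x * v for x, v in zip(row, vec)) for row in centered]
        nxt = [sum(row[j] * p for row, p in zip(centered, proj))
               for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # degenerate data: keep the current direction
            break
        vec = [x / norm for x in nxt]
    return vec

def summarize(graph, sentences, k=1):
    """Return indices (in document order) of the k highest-scoring sentences.
    Each sentence is a set of synset labels it was mapped to (assumed given)."""
    rank = pagerank(graph)
    nodes = sorted(rank)
    # Sentence-by-synset matrix: rank weight if the synset occurs, else 0.
    matrix = [[rank[n] if n in s else 0.0 for n in nodes] for s in sentences]
    means = [sum(col) / len(matrix) for col in zip(*matrix)]
    centered = [[x - m for x, m in zip(row, means)] for row in matrix]
    comp = first_component(centered)
    if sum(comp) < 0:  # resolve the sign ambiguity of the component
        comp = [-c for c in comp]
    # Assumed scoring rule: signed projection onto the first component.
    scores = [sum(x * c for x, c in zip(row, comp)) for row in centered]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return sorted(top[:k])

# Hypothetical toy sub-graph; labels are illustrative, not looked up in WordNet.
toy_graph = {
    "dog.n.01": ["canine.n.02", "pet.n.01"],
    "canine.n.02": ["dog.n.01", "wolf.n.01"],
    "wolf.n.01": ["canine.n.02"],
    "pet.n.01": ["dog.n.01", "cat.n.01"],
    "cat.n.01": ["pet.n.01"],
}
toy_sentences = [
    {"dog.n.01", "pet.n.01"},
    {"wolf.n.01"},
    {"cat.n.01", "pet.n.01", "dog.n.01"},
]
print(summarize(toy_graph, toy_sentences, k=2))
```

In a real system the sub-graph and the sentence-to-synset mapping would come from WordNet lookups on the document's words, and more than one principal component might be used; the sketch fixes both only to keep the example short.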
Related resources
An Extractive Approach of Text Summarization of Assamese using WordNet
Automatic text summarization means producing a summary of one or more documents with a computer program. The output text, or summary, should contain the most important points of the original text without changing its meaning. In this report, we present an extractive approach to text summarization for Assamese, a free-word-order inflectional Indic language, using WordNet. From our experiment, ...
WordNet-based Summarization of Unstructured Document
This paper presents an improved and practical approach to automatically summarizing unstructured documents by extracting the most relevant sentences from the plain-text or HTML version of the original document. The proposed technique is based on key sentences, using a statistical method and WordNet. Experimental results show that our approach compares favourably to a commercial text summarizer, and some...
Automatic Text Summarization Using Lexical Clustering
The goal of automatic text summarization is to reduce the size of a document while preserving its content. We investigate a summarization method which uses not only statistical features but also the contextual meaning of documents by using lexical clustering. We present a new method to compute lexical clusters in a text without high-cost knowledge resources; the WordNet thesaurus. Summarization ...
WordNet and Automated Text Summarization
Proposals for text classification and information retrieval that make use of the WordNet ontology have recently been presented. Generally, this methodology requires statistical induction of synset clusters and entails costly training of specific key domains. The present proposal intends to show that a simple recursive evaluation procedure and WordNet are rich enough to obtain useful results in tex...
Publication date: 2004